すしですか??: A Collaborative Filtering Approach to Creating Recommendations from Sushi User-Preference Data

EXECUTIVE SUMMARY

User-based collaborative filtering is a recommendation technique that suggests items to a target user based on the preferences of similar users. This technique was applied to the sushi dataset to generate recommendations. The dataset consisted of each user's top-5 sushi ranking, which served as the basis for the recommendations. The Surprise library (scikit-surprise) was used with the KNNBaseline algorithm, which gave the lowest RMSE, 1.141373. The results suggest that, besides explicit ratings, ranking preferences can serve as a basis for user-based collaborative recommender systems.

1. INTRODUCTION

A recommender system suggests specific items that are expected to interest specific users, based on the items those users have rated. These systems play a very important role in today's business landscape, where the volume of transactions is large. For this project, a recommendation technique known as User-Based Collaborative Filtering is employed.

In user-based recommendation systems, users similar to the target user are identified, and their preferences are used to recommend items to the target user. This approach is prominent in applications such as Netflix (for show recommendations) and Zomato (for food recommendations).
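The idea can be illustrated with a minimal sketch. It is not the implementation used in this project; the user names, the toy scores, and the helper names `cosine_similarity` and `recommend` are hypothetical, and cosine similarity is assumed here only as one common choice of similarity measure.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity between two users' score dicts (missing items count as 0)."""
    dot = sum(a[i] * b[i] for i in set(a) & set(b))
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(target, ratings, k=2, n=3):
    """Recommend up to n items unseen by `target`, using its k most similar users."""
    neighbours = sorted(
        ((cosine_similarity(ratings[target], r), u)
         for u, r in ratings.items() if u != target),
        reverse=True,
    )[:k]
    scores, weights = {}, {}
    for sim, user in neighbours:
        if sim <= 0:
            continue  # ignore users with nothing in common
        for item, score in ratings[user].items():
            if item in ratings[target]:
                continue  # only recommend items the target has not scored
            scores[item] = scores.get(item, 0.0) + sim * score
            weights[item] = weights.get(item, 0.0) + sim
    # Similarity-weighted average of the neighbours' scores
    predicted = {item: scores[item] / weights[item] for item in scores}
    return sorted(predicted, key=lambda i: (-predicted[i], i))[:n]

# Toy, made-up scores (not taken from the sushi dataset)
ratings = {
    "alice": {"maguro": 5, "toro": 4, "ebi": 2},
    "bob": {"maguro": 5, "toro": 5, "uni": 4},
    "carol": {"ebi": 5, "tamago": 4, "ika": 3},
}

print(recommend("alice", ratings, k=1))
```

In this toy example bob is the user most similar to alice, so with `k=1` the only candidate is the one item bob scored that alice has not.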

Many recommendation systems employ the Semantic Differential method to measure a user's preference. In this method, users indicate which items they like on a ranking scale, similar to the one below:

                                      highly preferred [5 4 3 2 1] not preferred
Fig 1. Sample Ranking Criterion

This project examines whether this ranking method is effective for generating recommendations beyond the items the user has already ranked.
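As a rough illustration, a top-5 ranking can be turned into scores on the 5-to-1 scale of Fig 1. The mapping below (first place gets 5, fifth place gets 1) is an assumption made for this sketch, and the function name `ranking_to_ratings` and the sample user are hypothetical.

```python
def ranking_to_ratings(user_id, ranked_items, top_score=5):
    """Turn an ordered preference list (most preferred first) into (user, item, score) triples."""
    if len(ranked_items) > top_score:
        raise ValueError("ranking is longer than the score scale")
    # Assumed mapping: position 0 -> top_score, position 1 -> top_score - 1, ...
    return [(user_id, item, top_score - pos) for pos, item in enumerate(ranked_items)]

# Hypothetical user 0 whose top five are toro, uni, maguro, ikura, ebi
# (item indices 8, 4, 2, 6, 0 in the sushi list of this report)
triples = ranking_to_ratings(0, [8, 4, 2, 6, 0])
print(triples)
```

Triples of this shape (user, item, score) are the usual input format for collaborative filtering libraries.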

1.1 Problem Statement

When building recommendation systems, many kinds of input can serve as the basis for recommendation. One can use a binary preference system, in which users simply enter 1 if they like the item and 0 otherwise. It may be better, however, to use ranking data on items the user has seen to generate recommendations from completely unseen items, with the expectation that the recommended items will be liked. The caveat is that it is impossible to know whether a recommendation succeeded unless the user is asked a follow-up question (Do you like the item recommended to you?). Nonetheless, recommendations produced by this method may be a good baseline.

This project therefore seeks to answer the following question: Can a recommendation system be built from users' ranking preference data?

1.2 Motivation

Interestingly, recommendation systems can surface insights about a culture that can then be put to use. Such is the case with analyzing sushi.

Sushi, which originated in Japan, is a well-known food in that country. Those of us outside Japan encounter only a few types of sushi, but in Japan there are many types, and they differ by location and prefecture.

Fig 2. Sushi

Each type of sushi has its own characteristic ingredient. Listed below are the sushi types that will be used in this study.

Index  Sushi Name  (Ingredients)
0 ebi (shrimp)
1 anago (sea eel)
2 maguro (tuna)
3 ika (squid)
4 uni (sea urchin)
5 tako (octopus)
6 ikura (salmon roe)
7 tamago (egg)
8 toro (fatty tuna)
9 amaebi (AMA shrimp)
10 hotategai (scallop)
11 tai (sea bream)
12 akagai (ark shell)
13 hamachi (young yellowtail)
14 awabi (abalone)
15 samon (salmon)
16 kazunoko (herring roe)
17 shako (squilla)
18 saba (mackerel)
19 chu_toro (mildly_fatty tuna)
20 hirame (flatfish)
21 aji (horse mackerel)
22 kani (crab)
23 kohada (medium_sized KONOSHIRO gizzard shad)
24 torigai (TORI_clam)
25 unagi (eel)
26 tekka_maki (tuna roll)
27 kanpachi (amberjack)
28 mirugai (MIRU_clam)
29 kappa_maki (cucumber roll)
30 geso (squid feet)
31 katsuo (oceanic bonito)
32 iwashi (sardine)
33 hokkigai (HOKKI-clam)
34 shimaaji (hardtail)
35 kanimiso (crab liver)
36 engawa (flesh from around the base of the dorsal and ventral fins of a flounder or flatfish)
37 negi_toro (fatty flesh of tuna minced to a paste and mixed with chopped green leaves of Welsh onions)
38 nattou_maki (fermented bean roll)
39 sayori (halfbeak)
40 takuwan_maki (DAIKON pickles roll)
41 botanebi (BOTAN shrimp)
42 tobiko (flying fish roe)
43 inari (fried tofu wrapper; http://en.wikipedia.org/wiki/Sushi)
44 mentaiko (chili cod roe)
45 sarada (salad)
46 suzuki (sea bass)
47 tarabagani (king crab)
48 ume_shiso_maki (pickled plum & perilla leaf roll)
49 komochi_konbu (herring roe & sea tangle)
50 tarako (cod roe)
51 sazae (turban shell)
52 aoyagi (meat of a trough shell)
53 toro_samon (fatty tuna & salmon)
54 sanma (Pacific saury)
55 hamo (pike conger)
56 nasu (egg plant)
57 shirauo (Japanese icefish)
58 nattou (fermented bean)
59 ankimo (angler liver)
60 kanpyo_maki (pickled gourd_maki)
61 negi_toro_maki (roll style of no.37)
62 gyusashi (raw beef)
63 hamaguri (clam)
64 basashi (raw horse meat)
65 fugu (blowfish)
66 tsubugai (TSUBU_shell)
67 ana_kyu_maki (sea eel & cucumber roll)
68 hiragai (=tairagi; pen shell)
69 okura (gumbo)
70 ume_maki (pickled plum roll)
71 sarada_maki (salad roll)
72 mentaiko_maki (chili cod roe roll)
73 buri (yellowtail)
74 shiso_maki (perilla leaf roll)
75 ika_nattou (squid & fermented bean)
76 zuke (tuna pickled in soy sauce)
77 himo (part of clam)
78 kaiware (DAIKON radish sprouts)
79 kurumaebi (prawn)
80 mekabu (part of tangle)
81 kue (kind of cabrilla)
82 sawara (Japanese Spanish mackerel)
83 sasami (kind of raw chicken)
84 kujira (whale)
85 kamo (wild duck)
86 himo_kyu_maki (part of clam & cucumber roll)
87 tobiuo (flying fish)
88 ishigakidai (ishigaki sea bream)
89 mamakari (Japanese scaled sardine)
90 hoya (ascidian)
91 battera (OSHIZUSHI style mackerel)
92 kyabia (caviar)
93 karasumi (dried mullet roe)
94 uni_kurage (sea urchin & jellyfish)
95 karei (flounder)
96 hiramasa (something like amberjack)
97 namako (sea cucumber)
98 shishamo (smelt)
99 kaki (oyster)

Fig 3. Sushis and Their Ingredients